Abstract

The stochastic matching problem was first introduced by Chen, Immorlica, Karlin, Mahdian, and Rudra (ICALP 2009). They presented a greedy algorithm together with an analysis showing that it is a 4-approximation. They also presented a modification of this problem called multiple-rounds matching, and gave an $O(\log n)$-approximation algorithm for it. Several questions remained open after this work: is the greedy algorithm a 2-approximation, is there a constant-ratio algorithm for multiple-rounds matching, and what about weighted graphs? For the last two problems, constant-factor approximations were given in the work of Bansal, Gupta, Nagarajan, and Rudra, and in the work of Li and Mestre. In this paper we answer the first question by showing that the greedy algorithm is in fact a 2-approximation.

1 Introduction

1.1 Problem statement

We are given an undirected graph $G = (V, E)$ in which every edge $uv \in E$ is assigned a real number $0 < p_{uv} \le 1$. Every vertex $v \in V$ has a positive integer $t_v$, called its patience number, associated with it. In every step we can probe any edge $uv \in E$, but only if $t_u > 0$ and $t_v > 0$. Probing edge $uv$ succeeds with probability $p_{uv}$, and after a success the vertices $u, v$ are removed from the graph together with all edges incident to them. In case of a failure, which has probability $1 - p_{uv}$, edge $uv$ is removed from the graph, and the patience numbers $t_u, t_v$ are decreased by 1. The results of all probes are independent. If after a certain step the patience number $t_v$ of vertex $v$ becomes 0, we remove vertex $v$ together with all edges incident to it. Our goal is to maximize the expected number of successful probes.

An instance of our problem is a pair $(G, t)$: an undirected weighted graph $G$ with a patience number $t_v$ for every vertex $v \in G$. The set of edges which were probed successfully forms a matching. We will say that such edges are taken into the matching.
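
The probing rules above can be sketched in a few lines of Python. This is only an illustration and not part of the analysis: the function name `probe`, the dictionary-based representation of an instance, and the optional `rng` argument are assumptions made for this example.

```python
import random

def probe(edges, patience, u, v, rng=random):
    """Probe edge (u, v) once and update the instance in place.

    edges:    dict mapping frozenset({u, v}) -> success probability p_uv
    patience: dict mapping vertex -> remaining patience t_v
    Returns True if the probe succeeded (the edge is taken into the matching).
    """
    e = frozenset((u, v))
    if e not in edges or patience[u] <= 0 or patience[v] <= 0:
        raise ValueError("edge cannot be probed")
    success = rng.random() < edges[e]
    if success:
        removed = {u, v}          # both endpoints leave the graph
    else:
        del edges[e]              # a failed edge is never probed again
        patience[u] -= 1
        patience[v] -= 1
        # vertices whose patience dropped to zero leave the graph
        removed = {w for w in (u, v) if patience[w] == 0}
    # Drop every remaining edge incident to a removed vertex.
    for f in [f for f in edges if f & removed]:
        del edges[f]
    return success
```

Repeatedly calling `probe` on edges that are still present simulates one run of any probing strategy; the number of `True` results is the size of the matching obtained in that run.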
1.2 Motivation

We give two motivations for this problem. A more detailed justification of the model can be found in [1].

Kidney exchange. A patient waiting for a kidney may have a donor, a friend or a family member, who would like to donate a kidney to him. But it may happen that the patient has only incompatible donors among his friends and family. If we consider many such patient-donor pairs, it may be possible to find two pairs in which the donor from the first pair can donate a kidney to the second patient, and the second donor can donate a kidney to the first patient. Performing such an operation on four people is called a kidney exchange. To know whether two pairs can perform a kidney exchange, we need to make a set of three tests. The first two tests are easy to make, e.g. they include blood-type tests. The third test is relatively hard, and checks the possibility of an exchange only between two particular pairs. It is also required to perform the exchange as soon as two pairs are matched. Thus we cannot check every two pairs to see whether they can exchange or not. However, we can use the results of the first two tests to estimate the probability of a successful result in the third test. Moreover, every patient can only be put through a limited number of tests, depending on his health. The goal is to perform the maximal number of transplants.

Dating. Let us consider a dating portal, where users are offered the chance to get acquainted with other users. On the basis of the personal details of each user we can estimate the probability of a match for each pair of users. We can suggest a meeting to a pair of users, but only after the meeting can we say whether they like each other. In case of a successful match, we assume they will leave the portal. While the patience of each user is individual, every user will give up after several unsuccessful meetings and stop using the portal. The objective is to suggest meetings to the users in a way which maximizes the number of satisfied couples.

1.3 Related work 

The problem we are considering belongs to the field of stochastic optimization, which includes many well-known problems to which some element of uncertainty has been added. Since matchings are a very popular combinatorial object used in models from many areas, many applications motivate stochastic matching problems. For example, the classic problem of online bipartite matching, solved by Karp et al. [5], has evolved in many ways to meet the needs of online advertising [6] [7]. In those problems we are given one side of a bipartite graph (advertisers), and the vertices from the other side (ad impressions) are revealed to us one by one together with the edges incident to them. Right after the arrival of a vertex we need to match it with some vertex from the first side. Although this problem is not precisely related to ours, Bansal et al. [3] gave an online variant of our stochastic matching problem which can be seen as a generalization of online bipartite matching.

The first work concerning our problem was the paper of Chen et al. [1]. This work introduced the stochastic matching problem together with its modification, multiple-rounds matching. In this modified version we are given a fixed number of rounds, and in each round we are allowed to probe any set of edges which forms a matching. They showed that the greedy algorithm for the main version of the problem is a 4-approximation for all graphs. For the multiple-rounds version they gave an $O(\log n)$-approximation. They also proved that finding an optimal multiple-rounds strategy is NP-hard. The authors also asked about a weighted version of this problem, in which every edge is assigned a positive real weight. Here the goal is to maximize the expected weight of the matching taken. Unfortunately, the modification of the greedy algorithm which probes edges in order of the product of their weight and probability does not work, and cannot give any constant-factor approximation.

Recently Bansal et al. [3] and Li and Mestre [2] showed how linear programming can be used to obtain constant approximation ratios for stochastic matching and multiple-rounds matching. A major advantage of this approach is that considering weighted edges makes almost no difference.

Bansal et al. obtained a ratio of 5.75 for the weighted case. They also gave an LP-based analysis proving that the greedy algorithm is a 5-approximation. For multiple-rounds matching they gave a constant-factor approximation, improving on the $O(\log n)$ ratio. They also introduced two new modifications. The first is an online version, which can be seen as a generalization of the online bipartite matching mentioned earlier. The second considers the case in which we probe hyperedges. For both variants constant-ratio approximations were given.

Li and Mestre also gave constant-factor approximations for the weighted and multiple-rounds versions. They also improved the bounds for the basic version of the problem. In the special case when all probabilities are equal and the graph is bipartite, they obtained a factor of approximately 2.3131. When the probabilities may differ, their algorithm gives a ratio worse than 4, but it can be combined with the greedy algorithm to get a factor of 3.51 for bipartite graphs and 3.88 for all graphs.

1.4 Our results 

In this paper we focus on the main version of the stochastic matching problem, and we consider neither the multiple-rounds model nor weighted graphs. The main result of this paper is an improved analysis of the greedy algorithm showing that it is a 2-approximation, which confirms the hypothesis stated by Chen et al.

2 Preliminaries 

If in a certain step an algorithm probes an edge incident to a vertex $\alpha \in G$, then we will say that the algorithm probes vertex $\alpha$.

We call an algorithm deterministic if the probe in each step is unambiguous and depends only on the previous steps.

If $ALG$ denotes an algorithm, then $\mathbb{E}ALG$ is the expected number of edges taken into the matching by this algorithm. Moreover, we will use $(G^{ALG}, t^{ALG})$ to denote the instance on which $ALG$ is executed.

Decision tree. Each deterministic algorithm $ALG$ can be represented by its decision tree $T_{ALG}$ (which has exponential size). Each node of that tree corresponds to probing an edge. A node $v \in T_{ALG}$ is assigned the value $p_v$ equal to $p_{\alpha\beta}$, where $\alpha\beta$ is the edge probed in $v$. The left subtree of a node $v \in T_{ALG}$ represents how the algorithm proceeds after a successful probe in node $v$, and the right subtree how it proceeds after a failure. More precisely:

• the left subtree corresponds to an algorithm on the instance $(G \setminus \{\alpha, \beta\}, t)$

• the right subtree corresponds to an algorithm on the instance $(G \setminus \{(\alpha, \beta)\}, t')$, where $t'_\alpha = t_\alpha - 1$, $t'_\beta = t_\beta - 1$, and $t'_\gamma = t_\gamma$ for all other vertices $\gamma$

The probability of reaching a node $v \in T_{ALG}$ will be denoted by $q_v$. The performance of an algorithm $ALG$ can be expressed using the decision tree:

$$\mathbb{E}ALG = \sum_{v \in T_{ALG}} q_v p_v.$$

We will denote the above sum by $\mathbb{E}T_{ALG}$. For a node $v \in T_{ALG}$, we denote by $T(v)$ the subtree of $T_{ALG}$ rooted at $v$, and by $L(v), R(v)$ its left and right subtrees, respectively.
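
As a small illustration of the decision-tree formula, the sketch below evaluates a tree in two ways: as the sum of $q_v p_v$ over all nodes, and through the recursion $\mathbb{E}T(v) = p_v(1 + \mathbb{E}L(v)) + (1 - p_v)\mathbb{E}R(v)$ that is used later in the analysis. The class `Node` and the function names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    p: float                          # success probability p_v of the edge probed here
    left: Optional["Node"] = None     # continuation after a successful probe
    right: Optional["Node"] = None    # continuation after a failed probe

def expected_sum(node, q=1.0):
    """Sum of q_v * p_v over the subtree; q is the probability of reaching `node`."""
    if node is None:
        return 0.0
    return (q * node.p
            + expected_sum(node.left, q * node.p)
            + expected_sum(node.right, q * (1.0 - node.p)))

def expected_rec(node):
    """The same value via E T(v) = p_v (1 + E L(v)) + (1 - p_v) E R(v)."""
    if node is None:
        return 0.0
    return (node.p * (1.0 + expected_rec(node.left))
            + (1.0 - node.p) * expected_rec(node.right))

# A three-node tree: the root succeeds with probability 0.5.
tree = Node(0.5, left=Node(0.5), right=Node(0.25))
```

For this tree both functions give $0.5 + 0.5 \cdot 0.5 + 0.5 \cdot 0.25 = 0.875$.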

Optimal algorithm. The optimal algorithm on the instance $(G, t)$ will be denoted by $OPT(G)$; we will not write $OPT(G, t)$, because it will always be clear which patience numbers we are using. We can assume without loss of generality that $OPT(G)$ is deterministic. This lets us represent the optimal algorithm by a decision tree. We will also assume that every subtree of the tree $T_{OPT}$ representing the optimal algorithm is optimal on its instance, even when the probability of reaching that subtree is zero.

Greedy algorithm. Let us consider the greedy algorithm for a given graph $G$. It will be denoted by $GRD(G)$.
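
A minimal Python sketch of the greedy rule follows. It assumes that the greedy algorithm always probes an available edge of greatest probability, which is the property used in the proof of Lemma 3.4. The function name `greedy_value`, the data layout, and the tie-breaking rule are hypothetical choices made for this example. The function computes the exact expected matching size by recursing over both outcomes of each probe, so it is only practical for very small graphs.

```python
def greedy_value(edges, patience):
    """Exact expected matching size when always probing a max-probability edge.

    edges:    dict mapping frozenset({u, v}) -> p_uv
    patience: dict mapping vertex -> t_v
    Ties are broken by Python's max (first maximal edge in dict order).
    """
    # Only edges whose endpoints both still have patience can be probed.
    live = {e: p for e, p in edges.items() if all(patience[w] > 0 for w in e)}
    if not live:
        return 0.0
    e = max(live, key=live.get)
    p = live[e]
    # Success: both endpoints leave the graph with all incident edges.
    after_success = {f: q for f, q in live.items() if not (f & e)}
    # Failure: the edge is removed and both patience numbers drop by one.
    after_failure = {f: q for f, q in live.items() if f != e}
    lowered = dict(patience)
    for w in e:
        lowered[w] -= 1
    return (p * (1.0 + greedy_value(after_success, patience))
            + (1.0 - p) * greedy_value(after_failure, lowered))
```

For example, on the path a-b-c with $p_{ab} = 0.5$, $p_{bc} = 0.25$ and all patience numbers equal to 1, the function returns 0.5, because a failed probe of ab exhausts the patience of b. Raising the patience of b to 2 gives 0.625.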
Empty graph. It is reasonable to also consider an empty graph, because it may appear during our inductive proof. Of course, in this case the performance of any algorithm is zero.

3 Analysis of the greedy algorithm 

We will start with an important lemma from [1] (Lemma 3.1).

Lemma 3.1. For any node $v \in T_{OPT}$, $\mathbb{E}T(v) \le \mathbb{E}L(v) + 1$.

Proof. The algorithm which follows $R(v)$ is a proper algorithm for the instance on which $T(v)$ works, so $\mathbb{E}R(v) \le \mathbb{E}T(v)$, because every subtree of $T_{OPT}$ is optimal. Hence

$$\mathbb{E}T(v) = p_v(1 + \mathbb{E}L(v)) + (1 - p_v)\mathbb{E}R(v) \le p_v(1 + \mathbb{E}L(v)) + (1 - p_v)\mathbb{E}T(v).$$

This gives $p_v\,\mathbb{E}T(v) \le p_v(1 + \mathbb{E}L(v))$, and finally $\mathbb{E}T(v) \le 1 + \mathbb{E}L(v)$. □

The following theorem is the main result of this paper. 

Theorem 3.2. The greedy algorithm is a 2-approximation for any instance $(G, t)$ of the stochastic matching problem, i.e. $\mathbb{E}OPT(G) \le 2\,\mathbb{E}GRD(G)$.

The sketch of the proof is as follows. The proof is by induction over subinstances $(G', t')$ of the problem $(G, t)$, where $G'$ is a subgraph of $G$ and $t'_v \le t_v$ for every vertex $v \in G'$. Using the optimal algorithm on the instance $(G, t)$, we derive two algorithms: the first on the instance $(G^{L_{GRD}}, t^{L_{GRD}})$ and the second on $(G^{R_{GRD}}, t^{R_{GRD}})$. We do not know explicitly how those algorithms work. We upper-bound the performance of $OPT(G)$ by those algorithms and some residues. Then we use the inductive assumption to bound the performance of the derived algorithms by greedy algorithms, and show that the residues are "small".

Proof. The case when the graph has no edges is trivial, so suppose it has at least one edge. Let $\alpha\beta$ be the first edge probed by the greedy algorithm. Denote by $L_{GRD}, R_{GRD}$ the algorithms which follow the left and right subtrees of $T_{GRD}$, respectively. When it causes no confusion, we write $OPT$ instead of $OPT(G)$.

Algorithm for instance $(G^{L_{GRD}}, t^{L_{GRD}})$. In this paragraph we derive an algorithm proper for the instance $(G^{L_{GRD}}, t^{L_{GRD}})$. We will not obtain it directly from $OPT$, but from an algorithm $OPT'$ which also works on the whole instance $(G, t)$.

Let $X$ be the set of nodes of $T_{OPT}$ which correspond to probing edge $\alpha\beta$. Define the algorithm $OPT'$ which follows $OPT(G)$ until it reaches a node $x \in X$. After reaching that node, $OPT'$ probes edge $\alpha\beta$, but then goes straight to the subtree $L(x)$ regardless of the probe's result, i.e. after the probe it runs as if the result was successful. This also means that $OPT'$ will probe neither $\alpha$ nor $\beta$ after probing $\alpha\beta$. In a node $x \in X$ the expected number of edges taken in the subtree $T(x)$ of $T_{OPT}$ is equal to $\mathbb{E}T(x) = p_{\alpha\beta} + p_{\alpha\beta}\,\mathbb{E}L(x) + (1 - p_{\alpha\beta})\,\mathbb{E}R(x)$, but in the corresponding subtree $T'(x)$ of $T_{OPT'}$ it is $\mathbb{E}T'(x) = p_{\alpha\beta} + \mathbb{E}L(x)$, because $OPT'$ goes to $L(x)$ regardless of the result. We can bound the performance of $OPT$ using $OPT'$:

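Lemma 3.3.

$$\mathbb{E}OPT \le \mathbb{E}OPT' + (1 - p_{\alpha\beta})\,\mathbb{P}(OPT \text{ probes } \alpha\beta). \tag{1}$$

Proof. From the decision tree of $OPT$ we get

$$\mathbb{E}OPT = \sum_{v \in T_{OPT} \setminus T(X)} q_v p_v + \sum_{x \in X} q_x\,\mathbb{E}T(x),$$

where $T(X)$ denotes the union of the subtrees $T(x)$ over $x \in X$. After using Lemma 3.1 we obtain

$$\mathbb{E}OPT \le \sum_{v \in T_{OPT} \setminus T(X)} q_v p_v + \sum_{x \in X} q_x\,(1 + \mathbb{E}L(x)).$$

The performance of $OPT'$ is equal to

$$\mathbb{E}OPT' = \sum_{v \in T_{OPT} \setminus T(X)} q_v p_v + \sum_{x \in X} q_x\,(p_{\alpha\beta} + \mathbb{E}L(x)),$$

so

$$\begin{aligned}
\mathbb{E}OPT &\le \sum_{v \in T_{OPT} \setminus T(X)} q_v p_v + \sum_{x \in X} q_x\,(1 + \mathbb{E}L(x)) \\
&= \sum_{v \in T_{OPT} \setminus T(X)} q_v p_v + \sum_{x \in X} q_x\,(p_{\alpha\beta} + \mathbb{E}L(x)) + \sum_{x \in X} q_x\,(1 - p_{\alpha\beta}) \\
&= \mathbb{E}OPT' + (1 - p_{\alpha\beta}) \sum_{x \in X} q_x.
\end{aligned}$$

All we need now is to notice that $\sum_{x \in X} q_x = \mathbb{P}(OPT \text{ probes } \alpha\beta)$. □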



Now we will use the algorithm $OPT'$ to construct an algorithm for the instance $(G^{L_{GRD}}, t^{L_{GRD}})$. Let the algorithm $ALG_L$ follow all moves of $OPT'$, except those which probe vertices $\alpha$ and $\beta$: when $OPT'$ probes vertex $\alpha$ or $\beta$, then $ALG_L$ does not probe any edge, and waits for the result of $OPT'$. We can also define $ALG_L$ using decision trees. $ALG_L$ follows the decision tree of $OPT'$, but upon reaching a node $v$ which probes vertex $\alpha$ or $\beta$, it flips a coin, and with probability $p_v$ it goes to the left subtree $L(v)$, and with probability $1 - p_v$ it goes to the right subtree $R(v)$. This (randomized) algorithm is a proper algorithm for the instance $(G^{L_{GRD}}, t^{L_{GRD}})$, because the graph $G^{L_{GRD}}$ is made from $G$ by removing vertices $\alpha$ and $\beta$. Moreover, for every vertex $v \in G^{L_{GRD}}$ we have $t_v^{L_{GRD}} = t_v$. The performance of $ALG_L$ is equal to the performance of $OPT'$ minus a penalty for skipped probes; let $R_L$ denote this penalty. Hence

$$\mathbb{E}OPT' = \mathbb{E}ALG_L + \mathbb{E}R_L. \tag{2}$$

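Let us look at $R_L$ under two conditions: when $OPT'$ probes $\alpha\beta$, and when it does not probe that edge. If $OPT'$ probes edge $\alpha\beta$, then with probability $p_{\alpha\beta}$ it will take this edge, and after that all probes of $OPT'$ will be valid probes for $ALG_L$, because they will probe neither $\alpha$ nor $\beta$. Thus the penalty in that case is equal to $\mathbb{E}(R_L \mid OPT' \text{ probes } \alpha\beta) = p_{\alpha\beta}$. If $OPT'$ does not probe $\alpha\beta$, then the penalty is equal to the expected number of edges incident to $\alpha$ or $\beta$ taken by $OPT'$ under this condition:

$$\mathbb{E}(R_L \mid OPT' \text{ does not probe } \alpha\beta) = \mathbb{P}(OPT' \text{ takes } \alpha \mid OPT' \text{ does not probe } \alpha\beta) + \mathbb{P}(OPT' \text{ takes } \beta \mid OPT' \text{ does not probe } \alpha\beta).$$

Thus the whole $\mathbb{E}R_L$ is equal to

$$\mathbb{P}(OPT' \text{ probes } \alpha\beta)\,p_{\alpha\beta} + \mathbb{P}(OPT' \text{ does not probe } \alpha\beta)\bigl(\mathbb{P}(OPT' \text{ takes } \alpha \mid OPT' \text{ does not probe } \alpha\beta) + \mathbb{P}(OPT' \text{ takes } \beta \mid OPT' \text{ does not probe } \alpha\beta)\bigr).$$

By definition, $OPT'$ works just like $OPT$ until it reaches $\alpha\beta$, so in place of the above expression we can write

$$\mathbb{P}(OPT \text{ probes } \alpha\beta)\,p_{\alpha\beta} + \mathbb{P}(OPT \text{ does not probe } \alpha\beta)\bigl(\mathbb{P}(OPT \text{ takes } \alpha \mid OPT \text{ does not probe } \alpha\beta) + \mathbb{P}(OPT \text{ takes } \beta \mid OPT \text{ does not probe } \alpha\beta)\bigr).$$

We introduce a shorter notation. Denote the event that $OPT$ probes $\alpha\beta$ by "probe $\alpha\beta$", and the opposite event by "$\neg$probe $\alpha\beta$". The event of taking $\alpha$ (or $\beta$) by $OPT$ under the condition of not probing $\alpha\beta$ will be denoted by "take $\alpha \mid \neg$probe $\alpha\beta$" (or "take $\beta \mid \neg$probe $\alpha\beta$"). Thus we can write

$$\mathbb{E}R_L = \mathbb{P}(\text{probe } \alpha\beta)\,p_{\alpha\beta} + \mathbb{P}(\neg\text{probe } \alpha\beta)\bigl(\mathbb{P}(\text{take } \alpha \mid \neg\text{probe } \alpha\beta) + \mathbb{P}(\text{take } \beta \mid \neg\text{probe } \alpha\beta)\bigr). \tag{3}$$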

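Joining all of this gives:

$$\begin{aligned}
\mathbb{E}OPT &\le \mathbb{E}OPT' + (1 - p_{\alpha\beta})\,\mathbb{P}(\text{probe } \alpha\beta) && \text{by (1)} \\
&= \mathbb{E}ALG_L + \mathbb{E}R_L + (1 - p_{\alpha\beta})\,\mathbb{P}(\text{probe } \alpha\beta) && \text{by (2)} \\
&= \mathbb{E}ALG_L + (1 - p_{\alpha\beta})\,\mathbb{P}(\text{probe } \alpha\beta) + \mathbb{P}(\text{probe } \alpha\beta)\,p_{\alpha\beta} \\
&\quad + \mathbb{P}(\neg\text{probe } \alpha\beta)\bigl(\mathbb{P}(\text{take } \alpha \mid \neg\text{probe } \alpha\beta) + \mathbb{P}(\text{take } \beta \mid \neg\text{probe } \alpha\beta)\bigr) && \text{by (3)}.
\end{aligned}$$

Finally we get that

$$\mathbb{E}OPT \le \mathbb{E}ALG_L + \mathbb{P}(\text{probe } \alpha\beta) + \mathbb{P}(\neg\text{probe } \alpha\beta)\bigl(\mathbb{P}(\text{take } \alpha \mid \neg\text{probe } \alpha\beta) + \mathbb{P}(\text{take } \beta \mid \neg\text{probe } \alpha\beta)\bigr). \tag{4}$$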



Algorithm for instance $(G^{R_{GRD}}, t^{R_{GRD}})$. The instance $(G^{R_{GRD}}, t^{R_{GRD}})$ is made from $(G, t)$ by removing edge $\alpha\beta$ and decreasing the patience numbers of $\alpha$ and $\beta$, i.e. $t^{R_{GRD}}_\alpha = t_\alpha - 1$ and $t^{R_{GRD}}_\beta = t_\beta - 1$. Define the algorithm $ALG_R$ on the instance $(G^{R_{GRD}}, t^{R_{GRD}})$ which follows $OPT(G)$, unless $OPT(G)$ probes an edge which $ALG_R$ cannot probe. If $OPT(G)$ makes such a probe, then $ALG_R$ does nothing, waits for $OPT(G)$, and follows it further. The definition with a decision tree: $ALG_R$ follows the decision tree of $OPT$, but upon reaching a node $v$ for which it cannot make a probe it flips a coin, and with probability $p_v$ it goes to the left subtree $L(v)$, and with probability $1 - p_v$ it goes to the right subtree $R(v)$.
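
The coin-flip construction used for the derived algorithms can be illustrated on a decision tree. In the sketch below, a skipped node still branches left with probability $p_v$ and right with probability $1 - p_v$, but its edge is not counted. The value of the tree then splits into the part kept by the derived algorithm and the penalty for skipped probes. The class `Node`, the function `split_value`, and the predicate `skipped` are hypothetical names introduced only for this example. In the actual proof the set of skipped probes is determined by the instance, not by an arbitrary predicate.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Node:
    edge: Tuple[str, str]             # the edge probed in this node
    p: float                          # its success probability p_v
    left: Optional["Node"] = None     # subtree after success
    right: Optional["Node"] = None    # subtree after failure

def split_value(node, skipped, q=1.0):
    """Return (kept, penalty) for the subtree rooted at `node`.

    skipped(node) is True when the derived algorithm cannot make this probe.
    Reach probabilities are unchanged by skipping, thanks to the coin flip.
    """
    if node is None:
        return 0.0, 0.0
    kept_l, pen_l = split_value(node.left, skipped, q * node.p)
    kept_r, pen_r = split_value(node.right, skipped, q * (1.0 - node.p))
    kept, pen = kept_l + kept_r, pen_l + pen_r
    gain = q * node.p                 # contribution q_v * p_v of this node
    if skipped(node):
        pen += gain
    else:
        kept += gain
    return kept, pen

# Example: skip every probe that touches vertex "a" or "b".
tree = Node(("a", "b"), 0.5,
            left=Node(("c", "d"), 0.5),
            right=Node(("a", "c"), 0.25))
kept, penalty = split_value(tree, lambda n: bool({"a", "b"} & set(n.edge)))
```

Here the kept part is 0.25 and the penalty is 0.625, and their sum 0.875 is the value of the original tree. This mirrors writing the performance of the original algorithm as the performance of the derived algorithm plus a penalty.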

Let us now describe such invalid probes. 

Consider one particular execution of $OPT(G)$. Suppose that in this execution $OPT(G)$ probes $\alpha\beta$. The algorithm $ALG_R$ cannot make this probe, because $\alpha\beta \notin G^{R_{GRD}}$. But before that, every probe of $OPT(G)$ was a valid probe for $ALG_R$. After $OPT(G)$ probes $\alpha\beta$, the patience numbers of $\alpha$ and $\beta$ become equal for $OPT(G)$ and $ALG_R$, so afterwards every probe of $OPT(G)$ is a valid probe for $ALG_R$. So in this case the only possible loss of $ALG_R$ is the edge $\alpha\beta$, and only if $OPT(G)$ probed it successfully. Suppose now that $OPT(G)$ did not probe $\alpha\beta$ in that execution. The only probes of $OPT(G)$ which are invalid for $ALG_R$ are probe number $t_\alpha = t^{R_{GRD}}_\alpha + 1$ of vertex $\alpha$, and probe number $t_\beta = t^{R_{GRD}}_\beta + 1$ of vertex $\beta$. Thus in this case $ALG_R$ can lose only the edges probed then.

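As before, we write the performance of $OPT$ as the performance of $ALG_R$ plus a penalty $R_R$:

$$\mathbb{E}OPT = \mathbb{E}ALG_R + \mathbb{E}R_R. \tag{5}$$

From the above explanation we have

$$\begin{aligned}
\mathbb{E}R_R = {} & \mathbb{P}(OPT \text{ probes } \alpha\beta)\,p_{\alpha\beta} + \mathbb{P}(OPT \text{ does not probe } \alpha\beta) \\
& \times \bigl(\mathbb{P}(OPT \text{ takes } \alpha \text{ in probe number } t_\alpha \mid OPT \text{ does not probe } \alpha\beta) \\
& \quad + \mathbb{P}(OPT \text{ takes } \beta \text{ in probe number } t_\beta \mid OPT \text{ does not probe } \alpha\beta)\bigr).
\end{aligned}$$

Let us write "take $\alpha$ in $t_\alpha$" instead of "$OPT$ takes $\alpha$ in probe number $t_\alpha$", and analogously for $\beta$. Now we can write the above equality in a shorter form:

$$\mathbb{E}R_R = \mathbb{P}(\text{probe } \alpha\beta)\,p_{\alpha\beta} + \mathbb{P}(\neg\text{probe } \alpha\beta)\bigl(\mathbb{P}(\text{take } \alpha \text{ in } t_\alpha \mid \neg\text{probe } \alpha\beta) + \mathbb{P}(\text{take } \beta \text{ in } t_\beta \mid \neg\text{probe } \alpha\beta)\bigr).$$

Putting this into (5) gives

$$\mathbb{E}OPT = \mathbb{E}ALG_R + \mathbb{P}(\text{probe } \alpha\beta)\,p_{\alpha\beta} + \mathbb{P}(\neg\text{probe } \alpha\beta)\bigl(\mathbb{P}(\text{take } \alpha \text{ in } t_\alpha \mid \neg\text{probe } \alpha\beta) + \mathbb{P}(\text{take } \beta \text{ in } t_\beta \mid \neg\text{probe } \alpha\beta)\bigr). \tag{6}$$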

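Combining. Multiplying inequality (4) by $p_{\alpha\beta}$ and equality (6) by $1 - p_{\alpha\beta}$, and adding them, gives

$$\begin{aligned}
\mathbb{E}OPT \le {} & p_{\alpha\beta}\,\mathbb{E}ALG_L + p_{\alpha\beta}\,\mathbb{P}(\text{probe } \alpha\beta) \\
& + p_{\alpha\beta}\,\mathbb{P}(\neg\text{probe } \alpha\beta)\bigl(\mathbb{P}(\text{take } \alpha \mid \neg\text{probe } \alpha\beta) + \mathbb{P}(\text{take } \beta \mid \neg\text{probe } \alpha\beta)\bigr) \\
& + (1 - p_{\alpha\beta})\,\mathbb{E}ALG_R + (1 - p_{\alpha\beta})\,\mathbb{P}(\text{probe } \alpha\beta)\,p_{\alpha\beta} \\
& + (1 - p_{\alpha\beta})\,\mathbb{P}(\neg\text{probe } \alpha\beta)\bigl(\mathbb{P}(\text{take } \alpha \text{ in } t_\alpha \mid \neg\text{probe } \alpha\beta) + \mathbb{P}(\text{take } \beta \text{ in } t_\beta \mid \neg\text{probe } \alpha\beta)\bigr).
\end{aligned}$$

After grouping the terms in the above expression we obtain

$$\begin{aligned}
\mathbb{E}OPT \le {} & p_{\alpha\beta}\,\mathbb{E}ALG_L + (1 - p_{\alpha\beta})\,\mathbb{E}ALG_R + p_{\alpha\beta}\,\mathbb{P}(\text{probe } \alpha\beta)\,(2 - p_{\alpha\beta}) \\
& + p_{\alpha\beta}\,\mathbb{P}(\neg\text{probe } \alpha\beta)\Bigl(\mathbb{P}(\text{take } \alpha \mid \neg\text{probe } \alpha\beta) + \frac{1 - p_{\alpha\beta}}{p_{\alpha\beta}}\,\mathbb{P}(\text{take } \alpha \text{ in } t_\alpha \mid \neg\text{probe } \alpha\beta) \\
& \quad + \mathbb{P}(\text{take } \beta \mid \neg\text{probe } \alpha\beta) + \frac{1 - p_{\alpha\beta}}{p_{\alpha\beta}}\,\mathbb{P}(\text{take } \beta \text{ in } t_\beta \mid \neg\text{probe } \alpha\beta)\Bigr).
\end{aligned} \tag{7}$$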

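We now state the key lemma.

Lemma 3.4.

$$\frac{1 - p_{\alpha\beta}}{p_{\alpha\beta}}\,\mathbb{P}(\text{take } \alpha \text{ in } t_\alpha \mid \neg\text{probe } \alpha\beta) \le \mathbb{P}(OPT \text{ does not take } \alpha \text{ despite } t_\alpha \text{ probes} \mid \neg\text{probe } \alpha\beta).$$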



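Proof. The probability of taking $\alpha$ into the matching is the sum of the probabilities of taking each edge incident to $\alpha$, so

$$\mathbb{P}(\text{take } \alpha \text{ in } t_\alpha \mid \neg\text{probe } \alpha\beta) = \sum_{\gamma \in Adj(\alpha)} \mathbb{P}(OPT \text{ takes } \alpha\gamma \text{ in probe number } t_\alpha \mid \neg\text{probe } \alpha\beta).$$

Edge $\alpha\gamma$ can be taken into the matching if we probed this edge and the probe was successful, i.e.

$$\sum_{\gamma \in Adj(\alpha)} \mathbb{P}(OPT \text{ takes } \alpha\gamma \text{ in probe number } t_\alpha \mid \neg\text{probe } \alpha\beta) = \sum_{\gamma \in Adj(\alpha)} \mathbb{P}(OPT \text{ probes } \alpha\gamma \text{ in probe number } t_\alpha \text{ AND probe is successful} \mid \neg\text{probe } \alpha\beta).$$

Probe number $t_\alpha$ is the last probe of vertex $\alpha$ regardless of its result. Thus its result and the fact that $OPT$ does not probe $\alpha\beta$ are independent. This gives

$$\sum_{\gamma \in Adj(\alpha)} \mathbb{P}(OPT \text{ probes } \alpha\gamma \text{ in probe number } t_\alpha \text{ AND probe is successful} \mid \neg\text{probe } \alpha\beta) = \sum_{\gamma \in Adj(\alpha)} \mathbb{P}(OPT \text{ probes } \alpha\gamma \text{ in probe number } t_\alpha \mid \neg\text{probe } \alpha\beta)\,p_{\alpha\gamma}. \tag{8}$$

The function $\frac{1 - x}{x}$ is decreasing, and $p_{\alpha\beta}$ is the greatest weight in the whole graph, so we get

$$\begin{aligned}
\frac{1 - p_{\alpha\beta}}{p_{\alpha\beta}}\,\mathbb{P}(\text{take } \alpha \text{ in } t_\alpha \mid \neg\text{probe } \alpha\beta)
&= \sum_{\gamma \in Adj(\alpha)} \frac{1 - p_{\alpha\beta}}{p_{\alpha\beta}}\,p_{\alpha\gamma}\,\mathbb{P}(OPT \text{ probes } \alpha\gamma \text{ in probe number } t_\alpha \mid \neg\text{probe } \alpha\beta) \\
&\le \sum_{\gamma \in Adj(\alpha)} \frac{1 - p_{\alpha\gamma}}{p_{\alpha\gamma}}\,p_{\alpha\gamma}\,\mathbb{P}(OPT \text{ probes } \alpha\gamma \text{ in probe number } t_\alpha \mid \neg\text{probe } \alpha\beta) \\
&= \sum_{\gamma \in Adj(\alpha)} (1 - p_{\alpha\gamma})\,\mathbb{P}(OPT \text{ probes } \alpha\gamma \text{ in probe number } t_\alpha \mid \neg\text{probe } \alpha\beta) \\
&= \sum_{\gamma \in Adj(\alpha)} \mathbb{P}(OPT \text{ probes } \alpha\gamma \text{ in probe number } t_\alpha \text{ AND this probe is unsuccessful} \mid \neg\text{probe } \alpha\beta).
\end{aligned}$$

The justification of the last equality is the same as in (8). Just as at the beginning,

$$\sum_{\gamma \in Adj(\alpha)} \mathbb{P}(OPT \text{ probes } \alpha\gamma \text{ in probe number } t_\alpha \text{ AND probe is unsuccessful} \mid \neg\text{probe } \alpha\beta) = \mathbb{P}(OPT \text{ does not take } \alpha \text{ despite } t_\alpha \text{ probes} \mid \neg\text{probe } \alpha\beta),$$

so the lemma is proved. □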

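Corollary 3.5.

$$\frac{1 - p_{\alpha\beta}}{p_{\alpha\beta}}\,\mathbb{P}(\text{take } \alpha \text{ in } t_\alpha \mid \neg\text{probe } \alpha\beta) \le \mathbb{P}(OPT \text{ does not take } \alpha \mid \neg\text{probe } \alpha\beta). \tag{9}$$

Proof. It follows from Lemma 3.4 and the obvious inequality

$$\mathbb{P}(OPT \text{ does not take } \alpha \text{ despite } t_\alpha \text{ probes} \mid \neg\text{probe } \alpha\beta) \le \mathbb{P}(OPT \text{ does not take } \alpha \mid \neg\text{probe } \alpha\beta).$$

□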

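Of course, Lemma 3.4 and Corollary 3.5 are also true with $\beta$ in place of $\alpha$.

The event that $OPT$ does not take $\alpha$ is opposite to "take $\alpha$", so we will denote it by "$\neg$take $\alpha$", and analogously for $\beta$. This also means that

$$\begin{aligned}
1 &= \mathbb{P}(\text{take } \alpha \mid \neg\text{probe } \alpha\beta) + \mathbb{P}(\neg\text{take } \alpha \mid \neg\text{probe } \alpha\beta) \\
&= \mathbb{P}(\text{take } \beta \mid \neg\text{probe } \alpha\beta) + \mathbb{P}(\neg\text{take } \beta \mid \neg\text{probe } \alpha\beta).
\end{aligned}$$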

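Finally we get

$$\begin{aligned}
\mathbb{E}OPT \le {} & p_{\alpha\beta}\,\mathbb{E}ALG_L + (1 - p_{\alpha\beta})\,\mathbb{E}ALG_R + p_{\alpha\beta}\,\mathbb{P}(\text{probe } \alpha\beta)\,(2 - p_{\alpha\beta}) \\
& + p_{\alpha\beta}\,\mathbb{P}(\neg\text{probe } \alpha\beta)\Bigl(\mathbb{P}(\text{take } \alpha \mid \neg\text{probe } \alpha\beta) + \frac{1 - p_{\alpha\beta}}{p_{\alpha\beta}}\,\mathbb{P}(\text{take } \alpha \text{ in } t_\alpha \mid \neg\text{probe } \alpha\beta) \\
& \quad + \mathbb{P}(\text{take } \beta \mid \neg\text{probe } \alpha\beta) + \frac{1 - p_{\alpha\beta}}{p_{\alpha\beta}}\,\mathbb{P}(\text{take } \beta \text{ in } t_\beta \mid \neg\text{probe } \alpha\beta)\Bigr) && \text{by (7)} \\
\le {} & p_{\alpha\beta}\,\mathbb{E}ALG_L + (1 - p_{\alpha\beta})\,\mathbb{E}ALG_R + p_{\alpha\beta}\,\mathbb{P}(\text{probe } \alpha\beta)\,(2 - p_{\alpha\beta}) \\
& + p_{\alpha\beta}\,\mathbb{P}(\neg\text{probe } \alpha\beta)\bigl(\mathbb{P}(\text{take } \alpha \mid \neg\text{probe } \alpha\beta) + \mathbb{P}(\neg\text{take } \alpha \mid \neg\text{probe } \alpha\beta) \\
& \quad + \mathbb{P}(\text{take } \beta \mid \neg\text{probe } \alpha\beta) + \mathbb{P}(\neg\text{take } \beta \mid \neg\text{probe } \alpha\beta)\bigr) && \text{by Corollary 3.5} \\
= {} & p_{\alpha\beta}\,\mathbb{E}ALG_L + (1 - p_{\alpha\beta})\,\mathbb{E}ALG_R + p_{\alpha\beta}\,\mathbb{P}(\text{probe } \alpha\beta)\,(2 - p_{\alpha\beta}) + 2p_{\alpha\beta}\,\mathbb{P}(\neg\text{probe } \alpha\beta) && \text{by the identity above} \\
\le {} & p_{\alpha\beta}\,\mathbb{E}ALG_L + (1 - p_{\alpha\beta})\,\mathbb{E}ALG_R + 2p_{\alpha\beta}\,\mathbb{P}(\text{probe } \alpha\beta) + 2p_{\alpha\beta}\,\mathbb{P}(\neg\text{probe } \alpha\beta) \\
= {} & p_{\alpha\beta}\,\mathbb{E}ALG_L + (1 - p_{\alpha\beta})\,\mathbb{E}ALG_R + 2p_{\alpha\beta}.
\end{aligned}$$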

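The inductive assumption gives

$$\mathbb{E}ALG_L \le \mathbb{E}OPT(G^{L_{GRD}}) \le 2\,\mathbb{E}GRD(G^{L_{GRD}}) = 2\,\mathbb{E}L_{GRD}, \text{ and}$$

$$\mathbb{E}ALG_R \le \mathbb{E}OPT(G^{R_{GRD}}) \le 2\,\mathbb{E}GRD(G^{R_{GRD}}) = 2\,\mathbb{E}R_{GRD}.$$

Hence

$$\begin{aligned}
\mathbb{E}OPT &\le p_{\alpha\beta}\,\mathbb{E}ALG_L + (1 - p_{\alpha\beta})\,\mathbb{E}ALG_R + 2p_{\alpha\beta} \\
&\le 2p_{\alpha\beta}\,\mathbb{E}L_{GRD} + 2(1 - p_{\alpha\beta})\,\mathbb{E}R_{GRD} + 2p_{\alpha\beta} \\
&= 2p_{\alpha\beta}\,(\mathbb{E}L_{GRD} + 1) + 2(1 - p_{\alpha\beta})\,\mathbb{E}R_{GRD} = 2\,\mathbb{E}GRD(G).
\end{aligned}$$

This completes the proof. □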
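
The bound of Theorem 3.2 can be checked numerically on tiny instances. The sketch below is an illustration only, not part of the proof. It computes the exact expected value of a probing strategy by recursing over both outcomes of every probe, once for the greedy choice (assumed here to be an available edge of maximum probability) and once for the best choice in every state, which takes exponential time. All function names and the example instance are hypothetical.

```python
def value(edges, patience, pick):
    """Exact expected matching size of the strategy defined by `pick`.

    edges:    dict mapping frozenset({u, v}) -> p_uv
    patience: dict mapping vertex -> t_v
    pick(live, after_probe) returns the value obtained in the current state.
    """
    live = {e: p for e, p in edges.items() if all(patience[w] > 0 for w in e)}
    if not live:
        return 0.0

    def after_probe(e):
        # Expected value if edge e is probed now and `pick` is followed afterwards.
        p = live[e]
        succ = {f: q for f, q in live.items() if not (f & e)}   # endpoints removed
        fail = {f: q for f, q in live.items() if f != e}        # edge removed
        lowered = dict(patience)
        for w in e:
            lowered[w] -= 1
        return (p * (1.0 + value(succ, patience, pick))
                + (1.0 - p) * value(fail, lowered, pick))

    return pick(live, after_probe)

def greedy_pick(live, after_probe):
    # Probe an edge of greatest probability.
    return after_probe(max(live, key=live.get))

def optimal_pick(live, after_probe):
    # Try every available edge and keep the best expected value.
    return max(after_probe(e) for e in live)

# Path a-b-c-d where the middle edge is slightly more likely to succeed.
edges = {frozenset(("a", "b")): 0.5,
         frozenset(("b", "c")): 0.6,
         frozenset(("c", "d")): 0.5}
patience = {v: 1 for v in "abcd"}
grd = value(edges, patience, greedy_pick)
opt = value(edges, patience, optimal_pick)
```

On this instance the greedy algorithm probes the middle edge first and obtains 0.6, while the best strategy probes an outer edge first and obtains 1.0, so the ratio is about 1.67 and stays below 2.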